AITopics

Technology: Information Technology > Artificial Intelligence (0.77)

Neural Information Processing SystemsFeb-11-2026, 00:26:33 GMT

SupplementaryMaterialsof TheSurprisingEffectivenessofPPOinCooperative Multi-AgentGames

We consider the 3 fully cooperative tasks from the original set shown in Figure 1(a):Spread, Comm,andReference. "Use feature normalization" refers to whether the feature normalization is applied to the networkinput. In this appendix section, we include results which demonstrate the benefit of parameter sharing. Note that our global state to the value network has agent-specific information, such as available actions and relative distances to other agents. When an agent dies, these agent-specific features become zero, while the remaining agent-agnostic features remain nonzero -this leads to adrastic distribution shift in the critic input compared to states in which the agent is alive.

artificial intelligence, easy 100, machine learning, (18 more...)

Country: North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Neural Information Processing SystemsOct-3-2025, 07:37:41 GMT

7967cc8e3ab559e68cc944c44b1cf3e8-AuthorFeedback.pdf

agent, maddpg and qmix, supplementary material, (11 more...)

Technology: Information Technology > Artificial Intelligence (0.31)

Maruyama, Norihiro, Yoshida, Takahide, Sato, Hiroki, Masumori, Atsushi, Johnsmith, null, Ikegami, Takashi

A Concurrent Modular Agent: Framework for Autonomous LLM Agents

arXiv.org Artificial IntelligenceAug-27-2025

We introduce the Concurrent Modular Agent (CMA), a framework that orchestrates multiple Large-Language-Model (LLM)-based modules that operate fully asynchronously yet maintain a coherent and fault-tolerant behavioral loop. This framework addresses long-standing difficulties in agent architectures by letting intention emerge from language-mediated interactions among autonomous processes. This approach enables flexible, adaptive, and context-dependent behavior through the combination of concurrently executed modules that offload reasoning to an LLM, inter-module communication, and a single shared global state.We consider this approach to be a practical realization of Minsky's Society of Mind theory. We demonstrate the viability of our system through two practical use-case studies. The emergent properties observed in our system suggest that complex cognitive phenomena like self-awareness may indeed arise from the organized interaction of simpler processes, supporting Minsky-Society of Mind concept and opening new avenues for artificial intelligence research. The source code for our work is available at: https://github.com/AlternativeMachine/concurrent-modular-agent.

artificial intelligence, large language model, natural language, (20 more...)

2508.19042

Country: Asia > Japan (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing SystemsAug-20-2025, 08:58:26 GMT

f1ea154c843f7cf3677db7ce922a2d17-AuthorFeedback.pdf

architecture, compression, video compression, (14 more...)

Technology: Information Technology > Artificial Intelligence (0.77)

Neural Information Processing SystemsAug-17-2025, 06:01:25 GMT

Supplementary Materials of The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games

We assume here that all agents share critic and actor networks, for notational convenience. Gaussian Distribution, from which an action is sampled, in continuous action spaces. In the loss functions above, B refers to the batch size and n refers to the number of agents. Multi-agent Particle-World Environment (MPE) was introduced in (Lowe et al., 2017). StarCraftII Micromanagement Challenge (SMAC) tasks were introduced in (Rashid et al., 2019).

agent, artificial intelligence, mappo, (17 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.70)

Neural Information Processing SystemsAug-17-2025, 06:01:21 GMT

9c1535a02f0ce079433344e14d910597-Paper-Datasets_and_Benchmarks.pdf

machine learning, mappo, reinforcement learning, (15 more...)

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report (0.67)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Zhaikhan, Ainur, Khammassi, Malek, Sayed, Ali H.

Policy Optimization in Multi-Agent Settings under Partially Observable Environments

arXiv.org Artificial IntelligenceAug-11-2025

This work leverages adaptive social learning to estimate partially observable global states in multi-agent reinforcement learning (MARL) problems. Unlike existing methods, the proposed approach enables the concurrent operation of social learning and reinforcement learning. Specifically, it alternates between a single step of social learning and a single step of MARL, eliminating the need for the time- and computation-intensive two-timescale learning frameworks. Theoretical guarantees are provided to support the effectiveness of the proposed method. Simulation results verify that the performance of the proposed methodology can approach that of reinforcement learning when the true state is known.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2508.06061

Genre: Research Report (0.50)

Industry: Education > Curriculum (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial IntelligenceMay-20-2025

Dynamic Sight Range Selection in Multi-Agent Reinforcement Learning

Liao, Wei-Chen, Wu, Ti-Rong, Wu, I-Chen

Multi-agent reinforcement Learning (MARL) is often challenged by the sight range dilemma, where agents either receive insufficient or excessive information from their environment. In this paper, we propose a novel method, called Dynamic Sight Range Selection (DSR), to address this issue. DSR utilizes an Upper Confidence Bound (UCB) algorithm and dynamically adjusts the sight range during training. Experiment results show several advantages of using DSR. First, we demonstrate using DSR achieves better performance in three common MARL environments, including Level-Based Foraging (LBF), Multi-Robot Warehouse (RWARE), and StarCraft Multi-Agent Challenge (SMAC). Second, our results show that DSR consistently improves performance across multiple MARL algorithms, including QMIX and MAPPO. Third, DSR offers suitable sight ranges for different training steps, thereby accelerating the training process. Finally, DSR provides additional interpretability by indicating the optimal sight range used during training. Unlike existing methods that rely on global information or communication mechanisms, our approach operates solely based on the individual sight ranges of agents. This approach offers a practical and efficient solution to the sight range dilemma, making it broadly applicable to real-world complex environments.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2505.12811

Country: North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceFeb-27-2025

A Generative Model Enhanced Multi-Agent Reinforcement Learning Method for Electric Vehicle Charging Navigation

Qi, Tianyang, Chen, Shibo, Zhang, Jun

With the widespread adoption of electric vehicles (EVs), navigating for EV drivers to select a cost-effective charging station has become an important yet challenging issue due to dynamic traffic conditions, fluctuating electricity prices, and potential competition from other EVs. The state-of-the-art deep reinforcement learning (DRL) algorithms for solving this task still require global information about all EVs at the execution stage, which not only increases communication costs but also raises privacy issues among EV drivers. To overcome these drawbacks, we introduce a novel generative model-enhanced multi-agent DRL algorithm that utilizes only the EV's local information while achieving performance comparable to these state-of-the-art algorithms. Specifically, the policy network is implemented on the EV side, and a Conditional Variational Autoencoder-Long Short Term Memory (CVAE-LSTM)-based recommendation model is developed to provide recommendation information. Furthermore, a novel future charging competition encoder is designed to effectively compress global information, enhancing training performance. The multi-gradient descent algorithm (MGDA) is also utilized to adaptively balance the weight between the two parts of the training objective, resulting in a more stable training process. Simulations are conducted based on a practical area in Xi\'an, China. Experimental results show that our proposed algorithm, which relies on local information, outperforms existing local information-based methods and achieves less than 8\% performance loss compared to global information-based methods.

algorithm, global state, information, (12 more...)